
Are the 1000 Genomes variant calls phased?

You can tell that a VCF file contains phased genotypes when the delimiter used in the GT field is a pipe symbol (|) rather than a forward slash (/), e.g.:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  HG00096
10   60523  rs148087467    T     G       100     PASS    AC=0;AF=0.01;AFR_AF=0.06;AMR_AF=0.0028;AN=2; GT:GL 0|0:-0.19,-0.46,-2.28
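
The check described above can also be done programmatically. The following is a minimal Python sketch, not part of the project's own tooling: the function name `genotypes_phased` is hypothetical, and it splits on any whitespace so that the example line as shown on this page can be pasted in directly (real VCF files are tab-delimited). It treats a genotype as phased only if it contains a pipe symbol.

```python
def genotypes_phased(vcf_line):
    """Return one bool per sample: True if that sample's GT uses '|'.

    Assumes a single VCF data line (not a header line) with a FORMAT
    column at position 9 and at least one sample column after it.
    """
    # Real VCF is tab-delimited; splitting on any whitespace is an
    # assumption made so that lines copied from this page also work.
    fields = vcf_line.split()
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")  # position of GT within FORMAT

    phased = []
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        # '|' separates phased alleles, '/' separates unphased alleles.
        phased.append("|" in gt)
    return phased


example = (
    "10   60523  rs148087467    T     G       100     PASS    "
    "AC=0;AF=0.01;AFR_AF=0.06;AMR_AF=0.0028;AN=2; GT:GL 0|0:-0.19,-0.46,-2.28"
)
print(genotypes_phased(example))  # prints [True]
```

A line whose GT value is written as 0/1 would give False for that sample.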

The VCF files produced by the final phase of the 1000 Genomes Project (phase 3) are phased. They can be found in the final release directory from the project and in the directory supporting the final publications.

The majority of the VCF files in official releases over the lifetime of the project contained phased variants. This includes the pilot, phase 1 and final phase 3 data sets.

The phase 1 release files contain global R2 values, but you can also use the VCF to PLINK converter if you wish to use our files with Haploview or a similar tool.

Related questions: